Question

Say I have a binary file of 12GB and I want to slice 8GB out of the middle of it. I know the position indices I want to cut between.

How do I do this? Obviously 12GB won't fit into memory, and neither will the 8GB slice, so I tried copying it in chunks. That doesn't seem to work: I was appending 10MB at a time to a new binary file, and the new file has discontinuities at the edges of each 10MB chunk.


In [1]:
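def copypart(src, dest, start, length, bufsize=1024*1024):
    # Copy `length` bytes of src, starting at byte offset `start`, into dest.
    f1 = open(src,'rb')
    f1.seek(start)

    f2 = open(dest,'wb')

    while length:
        chunk = min(bufsize,length)
        data = f1.read(chunk)
        if not data:
            break  # reached end of src before `length` bytes were copied
        f2.write(data)
        length -= len(data)  # count the bytes actually read

    f1.close()
    f2.close()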

In [2]:
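def copypart(src, dest, start, length, bufsize=1024*1024):
    # Same copy using with-statements, so both files are closed even on error.
    with open(src,'rb') as f_src:
        f_src.seek(start)
        with open(dest,'wb') as f_dest:
            while length:
                chunk = min(bufsize,length)
                data = f_src.read(chunk)
                if not data:
                    break  # reached end of src before `length` bytes were copied
                f_dest.write(data)
                length -= len(data)  # count the bytes actually read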
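
Since the original problem was discontinuities at chunk edges, it may be worth checking the output against the source after copying. The following is only a minimal sketch: `verify_slice` is a hypothetical helper name (not part of the code above), and it assumes the destination file is meant to hold exactly the bytes `start` to `start + length` of the source.

```python
import os

def verify_slice(src, dest, start, length, bufsize=1024 * 1024):
    """Return True if dest holds exactly the bytes src[start:start + length]."""
    # A size mismatch means bytes were dropped or duplicated somewhere.
    if os.path.getsize(dest) != length:
        return False
    with open(src, 'rb') as f_src, open(dest, 'rb') as f_dest:
        f_src.seek(start)
        remaining = length
        while remaining:
            n = min(bufsize, remaining)
            a = f_src.read(n)
            b = f_dest.read(n)
            # Empty read (unexpected EOF) or any differing byte fails the check.
            if not a or a != b:
                return False
            remaining -= len(a)
    return True
```

Calling `verify_slice(src, dest, start, length)` with the same arguments that were passed to `copypart` should return True if the slice was copied without gaps. Using a `bufsize` that differs from the one used for copying makes the comparison independent of the original chunk boundaries.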
